Automatic identification of confusable drug names

Authors

  • Grzegorz Kondrak
  • Bonnie J. Dorr
Abstract

OBJECTIVE: Many hundreds of drugs have names that look or sound so much alike that doctors, nurses and pharmacists can confuse them, dispensing the wrong one in errors that can injure or even kill patients. METHODS AND MATERIAL: We propose to address the problem by applying two new methods, one based on orthographic similarity ("look-alike") and the other based on phonetic similarity ("sound-alike"). To compare the effectiveness of the new methods for identifying confusable drug names with other known similarity measures, we developed a novel evaluation methodology. RESULTS: We show that the new orthographic measure (BI-SIM) outperforms other commonly used measures of similarity on a set containing both look-alike and sound-alike pairs, and that a new feature-based phonetic approach (ALINE) outperforms orthographic approaches on a test set containing solely sound-alike pairs. However, an approach that combines several different measures achieves the best results on two test sets. CONCLUSION: Our work is currently used as the basis of a system developed for the U.S. Food and Drug Administration for detection of confusable drug names.
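To make the idea of an orthographic ("look-alike") measure concrete, here is a minimal Python sketch of a bigram-based similarity score. It is an illustration only: the abstract does not define BI-SIM, so the function name `bigram_similarity`, the first-letter padding, the half-credit scoring of partially matching bigrams, and the normalization by the longer name are all assumptions, not the authors' implementation.

```python
def bigram_similarity(a: str, b: str) -> float:
    """Illustrative bigram-based orthographic similarity in [0, 1].

    Hypothetical sketch; not the BI-SIM definition from the paper.
    """
    a, b = a.lower(), b.lower()
    if not a or not b:
        return 0.0
    # Assumption: pad with a copy of the first letter so that the
    # initial letter of each name also takes part in a bigram.
    a, b = a[0] + a, b[0] + b
    xs = [a[i:i + 2] for i in range(len(a) - 1)]
    ys = [b[j:j + 2] for j in range(len(b) - 1)]

    def score(x: str, y: str) -> float:
        # Fraction of positions (0, 0.5 or 1) where two bigrams agree.
        return sum(p == q for p, q in zip(x, y)) / 2.0

    # Dynamic programme in the style of longest common subsequence,
    # run over bigrams and allowing partial matches.
    prev = [0.0] * (len(ys) + 1)
    for x in xs:
        cur = [0.0]
        for j, y in enumerate(ys, start=1):
            cur.append(max(prev[j], cur[j - 1], prev[j - 1] + score(x, y)))
        prev = cur
    # Assumption: normalize by the bigram count of the longer name.
    return prev[-1] / max(len(xs), len(ys))
```

For example, `bigram_similarity("Zantac", "Xanax")` returns a value strictly between 0 and 1, while identical names score 1.0 and names sharing no letters score 0.0. A screening tool built on such a score would flag pairs above a chosen threshold, and the threshold itself would have to be tuned on known confusable pairs.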


Similar resources

Identification of Confusable Drug Names: A New Approach and Evaluation Methodology

This paper addresses the mitigation of medical errors due to the confusion of sound-alike and look-alike drug names. Our approach involves application of two new methods, one based on orthographic similarity (“look-alike”) and the other based on phonetic similarity (“sound-alike”). We present a new recall-based evaluation methodology for determining the effectiveness of different similarity meas...


Hidden Markov models and selectively trained neural networks for connected confusable word recognition

This paper presents a new method for connected-word recognition with confusable vocabularies, such as connected letters. The recognition process is performed in two steps. First, a second-order HMM provides N-best word strings. Then, the strings of confusable letters are discriminated by a procedure based on acoustic knowledge and artificial neural networks (ANN). This method has been tested on...


Identification of Chinese Personal Names in Unrestricted Texts

Automatic identification of Chinese personal names in unrestricted texts is a key task in Chinese word segmentation, and can affect other NLP tasks such as word segmentation and information retrieval, if it is not properly addressed. This paper (1) demonstrates the problems of Chinese personal name identification in some IT applications, (2) analyzes the structure of Chinese personal names, and...


Protein names and how to find them

A prerequisite for all higher level information extraction tasks is the identification of unknown names in text. Today, when large corpora can consist of billions of words, it is of utmost importance to develop accurate techniques for the automatic detection, extraction and categorization of named entities in these corpora. Although named entity recognition might be regarded a solved problem in...


An Approach for Automatic Matching of Descriptive Addresses

Address matching (also called geocoding) is an applied spatial analysis which is frequently used in everyday life. Almost all desktop and web-based GIS environments are equipped with a module to match the addresses expressed in pre-defined standard formats on the map. It is an essential prerequisite for many of the functionalities provided by location-based services (e.g. car navigation). Sever...



Journal:
  • Artificial Intelligence in Medicine

Volume 36, Issue 1

Pages: -

Publication date: 2006